This project uses gene expression data to classify breast cancer samples into categories including Normal, Benign, and Malignant. Working with dataset GDS3952 from the National Center for Biotechnology Information (NCBI), the project addresses challenges typical of high-dimensional expression data: the "curse of dimensionality," overfitting, and multicollinearity. The aim is to build a reliable machine learning model that supports early detection and, in turn, better patient outcomes.
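To make the multicollinearity concern concrete, here is a minimal sketch of one common mitigation: a greedy filter that drops any feature that is almost perfectly correlated with a feature already kept. It is an illustration only and not necessarily the method used in this project. The gene names, the toy values, and the 0.95 threshold are all assumptions chosen for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature has no defined correlation; treat as 0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated_features(columns, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is <= threshold.

    `columns` maps a feature name to its values across samples.
    The threshold is an assumed cutoff, not a value taken from the project.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: 5 samples, 4 made-up genes (not from GDS3952).
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 4.0, 6.2, 8.1, 9.9],  # roughly 2 * gene_a, so redundant
    "gene_c": [5.0, 1.0, 4.0, 2.0, 3.0],
    "gene_d": [3.0, 3.5, 1.0, 4.0, 2.0],
}

selected = drop_correlated_features(expression)
print(selected)
```

In this toy case `gene_b` is removed because it nearly duplicates `gene_a`. A filter like this reduces redundancy but does not by itself solve overfitting when features far outnumber samples; regularization or dimensionality reduction would typically be applied as well.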